CS168: The Modern Algorithmic Toolbox Lecture #6: Stochastic Gradient Descent and Regularization

Authors

  • Tim Roughgarden
  • Gregory Valiant
Abstract

Last lecture we covered the basics of gradient descent, with an emphasis on the intuition behind and geometry underlying the method, plus a concrete instantiation of it for the problem of linear regression (fitting the best hyperplane to a set of data points). This basic method is already interesting and useful in its own right (see Homework #3). This lecture we’ll cover two extensions that, while simple, will bring your knowledge a step closer to the state-of-the-art in modern machine learning. The two extensions have different characters. The first concerns how to actually solve (computationally) a given unconstrained minimization problem, and gives a modification of basic gradient descent — “stochastic gradient descent” — that scales to much larger data sets. The second extension concerns problem formulation rather than implementation, namely the choice of the unconstrained optimization problem to solve (i.e., the objective function f). Here, we introduce the idea of “regularization,” with the goal of avoiding overfitting the function learned to the data set at hand, even for very high-dimensional data.
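
To make the two extensions concrete, the following sketch (ours, not from the notes) runs mini-batch stochastic gradient descent on an ℓ2-regularized least-squares objective. The function name sgd_ridge and the default step size, batch size, and regularization weight lam are illustrative assumptions rather than values from the lecture.

    import numpy as np

    def sgd_ridge(X, y, step=0.01, lam=0.1, batch=32, epochs=50, seed=0):
        """Mini-batch SGD for l2-regularized least squares (ridge regression).

        Objective: f(w) = (1/n) * ||Xw - y||^2 + lam * ||w||^2.
        Each update uses a gradient estimated from a small random batch
        instead of the full data set, which is what lets the method scale.
        """
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(epochs):
            # One pass over the data in random order, roughly batch rows at a time.
            for idx in np.array_split(rng.permutation(n), max(1, n // batch)):
                Xb, yb = X[idx], y[idx]
                # Batch estimate of the gradient of the average squared error,
                # plus the exact gradient of the regularization term lam * ||w||^2.
                grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx) + 2.0 * lam * w
                w -= step * grad
        return w

    # Illustrative use: recover planted weights from noisy linear data.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(10_000, 20))
    w_true = rng.normal(size=20)
    y = X @ w_true + 0.1 * rng.normal(size=10_000)
    w_hat = sgd_ridge(X, y)

Because each update touches only a small batch of rows, the per-step cost is independent of the number of data points, which is the scaling advantage described above; the lam * ||w||^2 term penalizes large weights, the simplest instance of the regularization idea the lecture develops.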

Similar resources

CS168: The Modern Algorithmic Toolbox Lecture #14: Markov Chain Monte Carlo

The previous lecture covered several tools for inferring properties of the distribution that underlies a random sample. In this lecture we will see how to design distributions and sampling schemes that will allow us to solve problems we care about. In some instances, the goal will be to understand an existing random process, and in other instances, the problem we hope to solve has no intrinsic ...

CS168: The Modern Algorithmic Toolbox Lecture #18: Linear and Convex Programming, with Applications to Sparse Recovery

Recall the setup in compressive sensing. There is an unknown signal z ∈ R^n, and we can only glean information about z through linear measurements. We choose m linear measurements a_1, …, a_m ∈ R^n. “Nature” then chooses a signal z, and we receive the results b_1 = 〈a_1, z〉, …, b_m = 〈a_m, z〉 of our measurements, when applied to z. The goal is then to recover z from b. Last lecture culminated i...

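As a small illustration of this measurement model (our sketch, not from the lecture), the code below generates a sparse signal z, draws random measurement vectors a_1, …, a_m as the rows of a matrix A, and records only the inner products b = Az; the dimensions n, m, the sparsity k, and the Gaussian measurement matrix are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k = 100, 25, 3  # ambient dimension, number of measurements, sparsity

    # "Nature" chooses a k-sparse signal z in R^n.
    z = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    z[support] = rng.normal(size=k)

    # Our m measurement vectors a_1, ..., a_m are the rows of A; all we
    # observe are the inner products b_i = <a_i, z>, i.e., b = A z.
    A = rng.normal(size=(m, n))
    b = A @ z

    # Recovering z from the underdetermined pair (A, b) is the sparse-recovery
    # problem; the lecture's tool for it is linear/convex programming,
    # e.g. minimizing ||x||_1 subject to A x = b.

Note that with m much smaller than n the system Ax = b has infinitely many solutions, which is why recovery needs the sparsity assumption and the optimization machinery the lecture describes.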

CS168: The Modern Algorithmic Toolbox Lecture #5: Sampling and Estimation

This week, we will cover tools for making inferences based on random samples drawn from some distribution of interest (e.g., a distribution over voter priorities, customer behavior, IP addresses, etc.). We will also learn how to use sampling techniques to solve hard problems, both problems that inherently involve randomness and those that do not. As a warmup, to get into the probabilisti...

Publication date: 2016